Chromatin Immunoprecipitation Sequencing    ◾    235

Figure 6.12 shows that Poly II localization is centered on the TSS, where most peaks are observed.

6.3.7  Peak Annotation

We will continue using R to annotate the peaks called with the MACS3 program, which stored the peak information in “*peaks.narrowPeak” files (the content of these files is discussed above). The peaks represent the most likely locations of protein–DNA interaction in the genome. The main goal of ChIP-Seq data analysis is to investigate the biological implications of epigenomic changes, such as changes in the genomic binding sites of proteins like TFs, histones, and Poly II. Annotating the protein–DNA interaction sites and their functions provides important information about these implications. Peak annotation is the process of associating the sites identified by the peaks with the genes, and the regions of those genes, affected by the epigenetic change. Most interactions, such as TF binding, initial localization of Poly II, and histone binding, occur in the cis-regulatory region of a gene. This region is usually close to the TSS and includes elements such as promoters, enhancers, silencers, and insulators, which play crucial roles in controlling gene expression in specific cell types, conditions, and developmental stages. An annotation program annotates ChIP-Seq peaks by associating each peak with the closest TSS of a gene, either upstream or downstream. A cis-regulatory region can also lie at a distance from the TSS or between the TSSs of two different genes.
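The nearest-TSS idea described above can be illustrated with a minimal base R sketch. The gene table, peak table, coordinates, and the function name annotate_nearest_tss are all hypothetical and serve only to show the logic; real annotation packages handle many more cases (gene bodies, introns, distal regions). Distances are reported relative to the gene's direction of transcription, so a negative value means the peak lies upstream of the TSS.

```r
# Toy gene table: one TSS per gene (hypothetical coordinates)
genes <- data.frame(
  gene   = c("geneA", "geneB", "geneC"),
  chrom  = c("chr1", "chr1", "chr2"),
  tss    = c(1000, 5000, 3000),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

# Toy peak table: one summit position per peak (hypothetical coordinates)
peaks <- data.frame(
  peak   = c("peak1", "peak2", "peak3"),
  chrom  = c("chr1", "chr1", "chr2"),
  summit = c(1200, 4500, 2900),
  stringsAsFactors = FALSE
)

# Assign each peak to the gene whose TSS is closest on the same chromosome
annotate_nearest_tss <- function(peaks, genes) {
  rows <- lapply(seq_len(nrow(peaks)), function(i) {
    cand <- genes[genes$chrom == peaks$chrom[i], , drop = FALSE]
    if (nrow(cand) == 0) {
      # No gene on this chromosome: leave the peak unannotated
      return(data.frame(peak = peaks$peak[i], gene = NA_character_,
                        distance = NA_real_, stringsAsFactors = FALSE))
    }
    # Genomic offset from each TSS, flipped for minus-strand genes so that
    # negative means upstream and positive means downstream of the TSS
    offset <- peaks$summit[i] - cand$tss
    offset <- ifelse(cand$strand == "-", -offset, offset)
    j <- which.min(abs(offset))
    data.frame(peak = peaks$peak[i], gene = cand$gene[j],
               distance = offset[j], stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

annotated <- annotate_nearest_tss(peaks, genes)
print(annotated)
```

In this toy data, peak2 is 500 bases below the TSS of geneB in genome coordinates, but because geneB is on the minus strand the peak is reported as 500 bases downstream.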

We will continue using R Bioconductor packages, with the “*peaks.narrowPeak” files as inputs for annotation. The following code creates a list of the sample file names and labels the samples “chip1”, “chip2”, and “chip3”, respectively:

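# Collect the peak files stored in the "vis" directory
bedfiles <- list.files("vis", pattern = "\\.bed$", full.names = TRUE)
bedfiles <- as.list(bedfiles)
# Label the samples (assumes exactly three files were found)
names(bedfiles) <- c("chip1", "chip2", "chip3")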
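Before annotating, it can help to confirm that a peak file reads in as expected. The following is a minimal base R sketch that assumes the files follow the standard ten-column narrowPeak layout; the toy peak values, the helper name read_narrowpeak, and the column label summitOffset (the format itself calls this column "peak") are illustrative assumptions, not part of the workflow above.

```r
# Two toy peaks in narrowPeak layout (tab-separated, no header); values are made up
toy_lines <- c(
  "chr1\t100\t500\tpeak_1\t250\t.\t12.5\t30.1\t27.4\t180",
  "chr2\t2000\t2600\tpeak_2\t90\t.\t4.2\t10.3\t8.1\t310"
)
toy_file <- tempfile(fileext = ".bed")
writeLines(toy_lines, toy_file)

# Column labels for the ten narrowPeak fields (the last label is our own choice)
narrowpeak_cols <- c("chrom", "start", "end", "name", "score", "strand",
                     "signalValue", "pValue", "qValue", "summitOffset")

# Hypothetical helper: read one narrowPeak-style file into a data frame
read_narrowpeak <- function(path) {
  peaks <- read.table(path, sep = "\t", header = FALSE,
                      col.names = narrowpeak_cols, stringsAsFactors = FALSE)
  # The summit offset is counted from the peak start
  peaks$summit <- peaks$start + peaks$summitOffset
  peaks
}

toy_peaks <- read_narrowpeak(toy_file)
print(toy_peaks[, c("chrom", "start", "end", "summit")])
```

If the files listed in bedfiles use this layout, the same helper could be applied to each of them with lapply(bedfiles, read_narrowpeak).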

Then, we can assign the database of known human genes to a variable so that we can use its annotation information and associate it with the peaks.

FIGURE 6.12  Average profile of ChIP-Seq peaks across.